ABSTRACT

Extreme deconvolution (XD) of broad-band photometric data can both separate stars from 
quasars and generate probability density functions for quasar redshifts, while incorporating 
flux uncertainties and missing data. Mid-infrared photometric colors are now widely used 
to identify hot dust intrinsic to quasars, and the release of all-sky WISE data has led to a 
dramatic increase in the number of IR-selected quasars. Using forced-photometry on public
WISE data at the locations of SDSS point sources, we incorporate these all-sky data into
the training of the XDQSOz models originally developed to select quasars from optical photometry.
The combination of WISE and SDSS information is far more powerful than SDSS
alone, particularly at z > 2. The use of SDSS+WISE photometry is comparable to the use
of SDSS+ultraviolet+near-IR data. We release a new public catalogue of 5,537,436 (total;
3,874,639 weighted by probability) potential quasars with probability P_QSO > 0.2. The catalogue
includes redshift probabilities for all objects. We also release an updated version of the
publicly available set of codes to calculate quasar and redshift probabilities for various combinations
of data. Finally, we demonstrate that this method of selecting quasars using WISE data
is both more complete and more efficient than simple WISE color-cuts, especially at high redshift.
Our fits verify that above z ≈ 3 WISE colors become bluer than the standard cuts applied to
select quasars. Currently, the analysis is limited to quasars with optical counterparts, and thus 
cannot be used to find highly obscured quasars that WISE color-cuts identify in significant 
numbers. 

Key words: methods: data analysis; catalogues; galaxies: active; galaxies: distances and 
redshifts; galaxies: photometry; (galaxies:) quasars: general 


1 INTRODUCTION 

As newer and larger imaging surveys are conducted over more area
and frequency ranges, photometric classification of quasars using
as much available information as possible is becoming increasingly
important. Spectroscopic follow-up of complete samples will
pose ever greater technical challenges as the depth of imaging
surveys increases. Studies of photometric quasar samples have
already shed light on the full quasar population (Scranton et al.
2005; Giannantonio et al. 2006, 2008; Myers et al. 2006, 2007a,b;
Hickox et al. 2011; Donoso et al. 2014; DiPompeo et al. 2014).
Not only is classification important, but photometric redshift probability
density functions (PDFs) are also useful for many applications
(e.g. Myers et al. 2006, 2007a). The ability to generate photo-zs
for quasars has improved with deeper and more precise multi-filter
photometry (Richards et al. 2001a,b; Budavári et al. 2001;
Bovy et al. 2012).


Hubble Fellow, John N. Bahcall Fellow 


The problem of photometric quasar classification is well 
studied, and often involves breaking up samples into redshift 
bins that allow the classification schemes to also serve as 
broad redshift indicators (e.g. Richards et al. 2004, 2009a,b;
Suchkov, Hanisch & Margon 2005; Ball et al. 2007, 2008).


Bovy et al. (2011) developed one of the most successful current
quasar classification techniques (XDQSO), using a large
number of Gaussians to model the flux space of quasars and
stars/unresolved galaxies (incorporating missing and/or noisy
data using extreme deconvolution, XD; Bovy, Hogg & Roweis
2011) to assign quasar probabilities in broad redshift bins. This
method was successfully applied in selecting quasars (primarily at
z > 2) for the Baryon Oscillation Spectroscopic Survey (BOSS;
Ross et al. 2012; Dawson et al. 2013). Bovy et al. (2012) extended
this method to incorporate redshift information into the model
(XDQSOz) such that it can be integrated analytically to provide
quasar probabilities over arbitrary redshift ranges and generate
photometric redshift PDFs. Bovy et al. (2012) also incorporated
ultraviolet (UV) and near-infrared (NIR) forced-photometry at
2 DiPompeo et al.


SDSS source positions into the model, which improved quasar
classification and essentially broke all redshift degeneracies
evident when using photometry from the Sloan Digital Sky Survey
(SDSS; York et al. 2000) ugriz filters (Fukugita et al. 1996) alone.


While Bovy et al. (2012) illustrated the power of NIR photometry
in the XDQSOz method, its use is limited because of the
shallow depths and relatively small areas of current NIR surveys
(see section 2.2.4). It is now well established that the mid-IR is
an efficient identifier of quasars (Stern et al. 2005; Hickox et al.
2007), and the public all-sky data from the Wide-Field Infrared
Survey Explorer (WISE; Wright et al. 2010) allow the use of this
method to identify very large samples. With no additional information,
over redshifts 0 < z < 3, WISE colors can select
luminous quasars at completeness levels near 90% (at conservative
flux limits of W2 < 15; Stern et al. 2012; Assef et al. 2013).
These techniques have been used to classify the first large samples
of obscured quasars (Mateos et al. 2012, 2013) and begin to
statistically compare obscured and unobscured quasar populations
(Donoso et al. 2014; DiPompeo et al. 2014).

In this paper, we incorporate the mid-IR data of WISE in the
XDQSOz model of the relative-flux-redshift density, and show that
these data dramatically increase the power of XDQSOz to identify
quasars, especially at high redshift, as well as improve photometric
redshift estimation. We also present a new catalogue of quasar
probabilities for point sources in SDSS DR8, including a potential
quasar catalogue for objects with probabilities above 0.2 that includes
redshift PDFs. We caution that this catalogue is probabilistic
in nature, and not suitable as a statistical sample (at least not in
its entirety; statistical subsamples can be selected from the full
catalogue). Section 3.3 discusses this in more detail.


2 METHODS & DATA 

2.1 Photometric classification and redshift estimation 


Details of the method used to calculate quasar/star probabilities for
objects and redshift PDFs are presented fully in Bovy et al. (2011)
and Bovy et al. (2012). We refer the reader there for complete details,
with a brief summary of the general considerations provided
here.

The important factor for photometrically classifying quasars
and estimating redshifts is the joint probability of an object’s fluxes,
redshift, and the possibility that it is a quasar: p(fluxes, z, quasar).
This can be written in many different ways depending on the approach,
but XDQSOz is based on:

$$p(\mathrm{fluxes}, z, \mathrm{quasar}) = p(\mathrm{fluxes}, z \mid \mathrm{quasar})\,P(\mathrm{quasar}). \tag{1}$$


Under the assumption that an object is a quasar, photometric redshifts
are given by:

$$p(z \mid \mathrm{fluxes}, \mathrm{quasar}) = \frac{p(\mathrm{fluxes}, z, \mathrm{quasar})}{p(\mathrm{fluxes}, \mathrm{quasar})}. \tag{2}$$


To determine if an object is a quasar in a given redshift range, we 
integrate the joint probability over redshift: 


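$$P(\mathrm{quasar\ in\ }\Delta z \mid \mathrm{fluxes}) = \int_{\Delta z} \mathrm{d}z\; p(\mathrm{quasar}, z \mid \mathrm{fluxes}), \tag{3}$$

where

$$p(\mathrm{quasar}, z \mid \mathrm{fluxes}) = \frac{p(\mathrm{quasar}, z, \mathrm{fluxes})}{p(\mathrm{fluxes})}, \tag{4}$$

$$p(\mathrm{fluxes}) = p(\mathrm{fluxes}, \mathrm{quasar}) + p(\mathrm{fluxes}, \mathrm{not\ a\ quasar}), \tag{5}$$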


and p(fluxes, not a quasar) is found by modeling the fluxes of non-quasars
(the stellar sample described below). The probability that
an object is a quasar at any z is calculated with Δz = [0, ∞].
Given that the quasar training set is in the range 0.3 < z < 5.5,
in practice these are the redshifts that should be considered and are
the limits applied when we refer to a quasar at “any redshift”.

2.2 Training data 

The training sets used are essentially identical to those of
Bovy et al. (2011) and Bovy et al. (2012), with the addition of
WISE data described in section 2.2.5, and so we only provide a
brief summary of these, referring the reader to the original papers
for full details.


2.2.1 Sloan Digital Sky Survey optical data 

The SDSS imaged 10,000 deg² of the northern and southern
Galactic sky in the u, g, r, i, and z filters. SDSS-III extended this by
~2500 deg² in the southern Galactic cap (Eisenstein et al. 2011). In
addition, the SDSS has taken follow-up spectroscopy of millions of
sources, which has generated several catalogues of various source
subclasses. All of the optical data used to train the XDQSOz algorithms
are based on SDSS imaging and spectroscopy.


2.2.2 Quasars and stars 


Our goal is to classify point-like objects as either stars or quasars.
This of course assumes that all resolved sources are galaxies, automatically
removing them from the samples, or conversely that
all quasars are unresolved. While not strictly true, this is largely
handled by limiting the redshifts probed to z ≥ 0.3. Limiting to
unresolved sources will discard a significant fraction of optically
faint obscured (“type 2”) quasars, but should not strongly impact
optically bright unobscured (“type 1”) sources.

The quasar training data consist of 103,601 spectroscopically
confirmed quasars from the SDSS data release 7 (DR7) quasar catalogue
(Schneider et al. 2010) with z ≥ 0.3. While the training is
performed in bins of SDSS i-band magnitude, all of the quasars
are included in each bin in order to have large enough numbers
to properly constrain the fits. Thus the quasar flux in each filter
is rescaled based on its i-band magnitude and redshift, using
an apparent-magnitude-dependent redshift prior. The prior is obtained
by integrating a model of the quasar luminosity function
(Hopkins, Richards & Hernquist 2007)¹ over the appropriate apparent
magnitude bin (see Figure 1 of Bovy et al. 2012 and section
2.2.2 for more details). This is used to re-weight the quasars
in each i-band bin so that the weighted histogram of the quasars
reproduces the prediction of the luminosity function. This method
also works under the assumption that quasar colors are independent
of their absolute magnitude. While there are known correlations
between quasar properties and luminosity (e.g. Baldwin 1977;
Yip et al. 2004), these effects are small compared to the intrinsic
scatter and are generally washed out in broadband colors.
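The re-weighting step can be pictured with the minimal sketch below, which assigns each quasar a weight so that the weighted redshift histogram follows a target prior. The function name, the binning, and the toy prior are illustrative assumptions; the actual prior comes from integrating the luminosity function over each apparent-magnitude bin, and the accompanying flux rescaling is not shown.

```python
import numpy as np

def prior_weights(z_qso, z_edges, prior_per_bin):
    """Weights that make the weighted redshift histogram match a target prior.

    Assumes every redshift lies within [z_edges[0], z_edges[-1]].
    """
    z_qso = np.asarray(z_qso, dtype=float)
    prior = np.asarray(prior_per_bin, dtype=float)
    prior = prior / prior.sum()
    counts, _ = np.histogram(z_qso, bins=z_edges)
    # Bin index of each quasar; the clip puts z == z_edges[-1] in the last bin,
    # matching the convention used by np.histogram.
    idx = np.clip(np.digitize(z_qso, z_edges) - 1, 0, len(prior) - 1)
    weights = prior[idx] / counts[idx]
    return weights / weights.sum()

# Toy example: uniformly distributed "training" redshifts and a made-up prior.
rng = np.random.default_rng(0)
z_train = rng.uniform(0.3, 5.5, size=5000)
edges = np.linspace(0.3, 5.5, 27)
centres = 0.5 * (edges[1:] + edges[:-1])
target = np.exp(-0.5 * ((centres - 1.5) / 0.8) ** 2)   # hypothetical prior shape
w = prior_weights(z_train, edges, target)
weighted_hist, _ = np.histogram(z_train, bins=edges, weights=w)
```

When every bin contains at least one object, the weighted histogram reproduces the normalized target prior exactly.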

To model the flux space of stars, we use a large set of 
point sources from the co-added photometry of SDSS Stripe 82 


¹ Preliminary work with BOSS quasars suggests that this luminosity function
may be incorrect above z ~ 2.5 (Ross, private communication). At
present, this is still the best choice, but it is possible that future updates may be
needed.
(Abazajian et al. 2009) that are selected to have low variability, because
essentially all quasars are variable and most stars are not
(Kirkpatrick et al. 2011). This identifies a stellar training set of
701,215 objects. Of those with available spectra (23,540), only 221
are quasars, indicating that the contamination is extremely low. Additionally,
as unresolved galaxies are also not highly variable, these
objects are included as part of the stellar training set and thus accounted
for when the quasar probability is calculated.

2.2.3 Galaxy Evolution Explorer UV data 

We utilize data from the all-sky Galaxy Evolution Explorer
(GALEX; Martin et al. 2005) in the near- and far-UV (NUV and
FUV). Rather than using GALEX catalogue fluxes, we use values
force-photometered from GALEX images at SDSS source positions
(Aihara et al. 2011). This allows us to obtain low signal-to-noise
PSF fluxes of objects that are not detected by GALEX, and thus
not included in the GALEX catalogues. One update in this work
relative to Bovy et al. (2012) is that the GALEX forced photometry,
which was not complete at the time of the original analysis, now
provides UV data for more objects: 81,011 quasars have
NUV data, and 68,523 have FUV data, with the numbers increasing
slightly for those that only have upper limits. These additional
UV data in the training set improve XDQSOz slightly on their own
(see section 3.1). The distributions of the UV signal-to-noise ratio
(SNR) of these sources are shown in the middle panel of Figure 1.

2.2.4 UKIRT Infrared Deep Sky Survey NIR data 

The UKIRT Infrared Deep Sky Survey (UKIDSS) covers ~4000
deg² of the SDSS footprint in the Y, J, H, and K NIR filters.
We make use of this imaging, again force-photometered at SDSS
source positions. We find that ~26,000 objects have complementary
fluxes in all of these filters, again with the number increasing
when those with only upper limits are included. The numbers of
objects with flux measurements per band are 26,487 (Y); 26,450
(J); 26,486 (H); 26,561 (K). Again, these numbers are slightly
updated from Bovy et al. (2012), though the differences are quite
small. The distributions of SNR values for the UKIDSS imaging
are shown in the bottom panel of Figure 1.

2.2.5 Wide-Field Infrared Survey Explorer MIR data 

WISE has mapped the entire sky multiple times, in four bands centered
at 3.4, 4.6, 12, and 22 μm (W1, W2, W3, and W4, respectively).
The 5σ limit in each band is at least 0.08, 0.11, 1, and 6 mJy,
respectively, and improves toward the ecliptic poles where the observing
strategy leads to deeper imaging. The angular resolutions in
each band are 6.1, 6.4, 6.5, and 12 arcsec, respectively. WISE has
released two full-sky source catalogs; however, again in order
to use the full power of the XD method and incorporate information
on sources below the flux limit of the survey, we make use of
forced photometry of the WISE All-Sky Release imaging at SDSS
positions (to be included in a future SDSS release; see Lang 2014;
Lang, Hogg & Schlegel 2014).

We utilize the two most sensitive bands in WISE, W1 and W2,
which are also the two bands most efficient for selecting quasars via their IR
colors. We find 103,050 quasars with W1 fluxes and 103,019 with
W2 fluxes. The SNR distributions for these sources are shown in
Figure 1. Not only does WISE have the most data for the training
set, it is also the deepest.


One possible complication with using WISE forced-photometry
is that it is performed using source positions from
DR12 imaging while this work uses DR8 imaging. In addition to
the known astrometry error that was corrected with DR9 (especially
above Dec ~ 41°; Ahn et al. 2012), some objects have updated
“primary” imaging from DR8 to DR12, causing them to shift
positions or change primary run number. Our catalogue is built
by running XDQSOz on all available data for an object, matching
objects in the forced-photometry catalogues by run. If an object’s
primary SDSS observation changes runs, we may not find its corresponding
WISE data. However, testing reveals that these issues
affect a very small fraction of objects: over 99.9% of DR8 primary
point sources match to force-photometered WISE sources in
the same runs in DR12. Of the unmatched objects, ~85% are below
the SDSS i-band flux limit, and so are more likely to be spurious
sources. The remaining missing sources are likely due to astrometry
changes, but these are an essentially negligible fraction in our
final training set and catalogue (section 3.3).

The different epochs of the surveys utilized mean that observations
span over 10 years in the observed frame (over 3 years
at z = 2). Typical UV/optical variability over these rest-frame
timescales is 0.1-0.2 mag, depending on luminosity (Hook et al.
1994). Variability in the IR over these timescales is also ~0.1
mag (J-band; Kozłowski et al. 2010). These variability levels are
less than the average errors for individual filters, and definitely
less than the range in colors, and are therefore largely unimportant
for this work.
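The rest-frame interval quoted above follows from cosmological time dilation. As a quick check, taking the observed baseline to be 10 yr:

```latex
\Delta t_{\rm rest} = \frac{\Delta t_{\rm obs}}{1+z}
  = \frac{10\ \mathrm{yr}}{1+2} \approx 3.3\ \mathrm{yr}
```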

2.3 Relative-flux-redshift density model 

XDQSOz models the relative-flux-redshift distribution with a
large number of Gaussians by deconvolving the distribution
for the training sets described above with the XD technique
(Bovy, Hogg & Roweis 2011). This method is uniquely suited to
the data available for modeling quasar flux space, because not all
objects have all fluxes available and the uncertainties are heterogeneous.
XD assumes that the flux uncertainties are Gaussian (as
they are for the SDSS; Ivezić et al. 2003, 2007). The spectroscopic
redshifts are assumed to have null uncertainties because the typical
values are on the order of 10“® (Schneider et al. 2010), far smaller
than the errors on photometric redshifts.
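To make the role of heterogeneous uncertainties and missing fluxes concrete, the following minimal sketch evaluates the likelihood of one object under an error-convolved Gaussian mixture, the quantity at the heart of XD. It assumes diagonal flux uncertainties and handles a missing flux by dropping that dimension, which for a Gaussian is equivalent to marginalizing over it. The function name and the toy inputs are hypothetical and are not taken from the XDQSOz code.

```python
import numpy as np

def xd_log_likelihood(x, x_var, observed, amps, means, covs):
    """log p(x) for one object under a noise-convolved Gaussian mixture.

    x        : (d,) measured values (entries for missing data are ignored)
    x_var    : (d,) measurement variances (diagonal noise assumed)
    observed : (d,) boolean mask, False where a flux is missing
    amps, means, covs : mixture amplitudes, (d,) means, (d, d) covariances
    """
    x = np.asarray(x, dtype=float)
    x_var = np.asarray(x_var, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    x_obs = x[observed]
    noise = np.diag(x_var[observed])
    n_obs = x_obs.size
    log_terms = []
    for amp, mean, cov in zip(amps, means, covs):
        mean = np.asarray(mean, dtype=float)
        cov = np.asarray(cov, dtype=float)
        # Intrinsic covariance of the observed dimensions plus the noise.
        total = cov[np.ix_(observed, observed)] + noise
        resid = x_obs - mean[observed]
        _, logdet = np.linalg.slogdet(total)
        chi2 = resid @ np.linalg.solve(total, resid)
        log_terms.append(
            np.log(amp) - 0.5 * (chi2 + logdet + n_obs * np.log(2.0 * np.pi))
        )
    log_terms = np.array(log_terms)
    peak = log_terms.max()                      # log-sum-exp for stability
    return float(peak + np.log(np.sum(np.exp(log_terms - peak))))

# Toy check: one unit Gaussian, second dimension missing, unit noise variance.
toy_ll = xd_log_likelihood(
    x=[0.0, 99.0], x_var=[1.0, 0.0], observed=[True, False],
    amps=[1.0], means=[[0.0, 0.0]], covs=[np.eye(2)],
)
```

In the toy check the effective variance of the observed dimension is 1 + 1 = 2, so the result equals the log-density of a zero-mean Gaussian with variance 2 evaluated at zero.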

As in Bovy et al. (2012), we use a sum of 60 Gaussians to
model the deconvolved flux-redshift density. All fits are performed
in 0.2 mag wide bins of i-band magnitude, starting at i = 17.7 up
to i = 22.5 (for 47 overlapping bins centered 0.1 mag apart). All
of the quasars are used in the fit to each bin, by rescaling the fluxes
according to the luminosity function as described in section 2.2.2.
The model contains a total of 47 × (60 × [1 + d + d(d+1)/2] − 1) parameters,
where d is the dimension of the fitted space; for five SDSS fluxes,
two GALEX fluxes, four UKIDSS
fluxes, and two WISE fluxes, this is a total of 296,053 parameters
(plus an additional 85,493 from the stellar fits from XDQSO with
all of the available fluxes).
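The quoted totals can be reproduced with a short calculation. The sketch below assumes d = 13 for the quasar model (twelve fluxes relative to the i band plus redshift), which is our reading of the formula rather than a value stated explicitly here. The stellar count is recovered if one assumes 20 Gaussians in d = 12 dimensions; that combination is likewise an assumption, chosen because it matches the quoted number.

```python
def n_mixture_parameters(n_bins, n_gauss, d):
    """Free parameters of a Gaussian mixture fit independently in n_bins bins.

    Each Gaussian has 1 amplitude, d means, and d(d+1)/2 covariance entries;
    the amplitudes sum to one, removing one free parameter per bin.
    """
    per_gauss = 1 + d + d * (d + 1) // 2
    return n_bins * (n_gauss * per_gauss - 1)

# Quasar model: 47 bins, 60 Gaussians, assumed d = 13.
quasar_params = n_mixture_parameters(47, 60, 13)
# Stellar model: assumed 20 Gaussians in d = 12 (no redshift dimension).
stellar_params = n_mixture_parameters(47, 20, 12)
```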

Figures 2 and 3 show the flux-flux (W1 and W2 versus SDSS
r) and color-color (r − W1 versus W1 − W2) plots in one i-band
bin (18.6 ≤ i < 18.8) for a random re-sampling from the best-fit
XD model (convolved with actual data errors, left columns) and
the data these fits are based on, resampled according to the quasar
luminosity function (right columns). Note that all of the WISE data
are kept in the native Vega magnitude system. Figures 4 and 5 show
similar plots comparing samplings of the redshift fits to the redshifts
of the real data. The resampled error-convolved fits to the
data are nearly identical to the real data in all cases, illustrating
the accuracy of the fits. We also point out in Figure 5 that above
z ≈ 3 the colors fall below W1 − W2 = 0.8, a common cut used
to select quasars from WISE data alone (e.g. Stern et al. 2012; see
section 3.2).

Figure 1. Cumulative distributions of the SNR for WISE, GALEX, and
UKIDSS photometric data of spectroscopically confirmed quasars. WISE
has data available for more sources, and is also deeper than the other surveys.



Figure 2. W1 (top) and W2 (bottom) versus SDSS r (all normalized by
the i-band flux), illustrating a projection of the space in which the fits are
performed. The left columns show a random sampling from the XD fits
with errors from the real data added in, and the right columns show the
real quasar data resampled according to the quasar luminosity function (see
sections 2.2.2 and 2.3). Note that the WISE fluxes retain their native Vega
zero-point, while the SDSS fluxes are in AB, hence the large relative WISE-SDSS
fluxes.


Figure 3. The same as Figure 2, but for r − W1 versus W1 − W2 colors.



3 RESULTS & DISCUSSION 
3.1 Performance 

Sections 5 and 6 of Bovy et al. (2012) already demonstrated
that XDQSOz outperforms many other methods when calculating
quasar probabilities and photometric redshifts. Here, we simply
present the improvements with the addition of WISE flux information.
In order to assess the performance of XDQSOz, we calculate
the quasar probability P_QSO for the known SDSS DR7 spectroscopic
quasars as well as the stars in our training set. We use the
entire samples for all of the analysis below, including additional
fluxes where they are available (and the analysis requires them).

First, we present the results using the broad redshift bins of
the original XDQSO (z < 2.2, 2.2 < z < 3.5, and z > 3.5),
shown in Figures 6 (quasars) and 7 (stars). These figures show a
comparison between using only SDSS fluxes, SDSS+WISE, and
SDSS+GALEX+UKIDSS+WISE. The percentages of the subsamples
belonging to the redshift bins of interest that are identified
as quasars at P > 0.5 and P > 0.8 are given in each panel for easy
comparison.

The method does an excellent job identifying quasars in the
appropriate redshift bins, and not as stars, even with only the SDSS
fluxes. However, there is significant improvement with the addition
of WISE information, especially at z > 2.2. At low redshift, gains
are on the order of ~5%. In the mid-redshift range, the improvement
is ~15-20%. At high redshifts, the improvement is ~10-15%.
The addition of GALEX and UKIDSS data adds improvements of a
few percent at most over the SDSS+WISE information; adding
WISE fluxes has the single greatest effect on the performance of
XDQSOz when classifying quasars. Similar results are found when
testing the stellar training set (Figure 7). Very few stars are identified
as quasars, with a significant improvement when WISE data
are added and only a marginal further increase with GALEX and
Figure 6. The XDQSOz probability that a spectroscopically confirmed quasar is a quasar in broad redshift bins, using just SDSS photometry (left),
SDSS+WISE photometry (center), and SDSS+GALEX+UKIDSS+WISE photometry (right). Within each panel are the distributions of the probabilities
that objects are low-redshift quasars (z < 2.2, first panel), medium-redshift quasars (2.2 < z < 3.5, second panel), high-redshift quasars (z > 3.5, third
panel), or stars (bottom panel). Line styles indicate the spectroscopic classification of each subset. The percentages of objects that belong to each redshift bin
of interest and meet the thresholds P_QSO > 0.5 and P_QSO > 0.8 are listed in each panel. In the case of P_star, these percentages are for quasars at all redshifts.
Similar figures showing additional combinations of data are provided in the appendix.

Figure 7. The XDQSOz probabilities for the stellar training set, using various combinations of photometric data. The top panels show the probability that an
object is a quasar at any redshift, and the bottom panels show the probability that it is a star. Note that there is some contamination from quasars in the stellar training
set (on the order of ~1%). Similar figures showing additional combinations of data are provided in the appendix.


UKIDSS information. Similar plots for other combinations of data 
are provided in the appendix for comparison. 

The top panels of Figure 8 show a similar analysis, using
probabilities that an object is a quasar at any redshift. With
the SDSS data alone, ~90% of the quasars are recovered with
P_QSO > 0.8. The addition of WISE photometry improves this dramatically,
increasing the fraction with P_QSO > 0.8 to ~97%. The
further addition of GALEX and UKIDSS data only increases the
performance by another ~1%. Without the use of WISE these percentages
only reach ~93% for any other combinations of data (see
the appendices).

Because our training set has additional GALEX and UKIDSS 
Figure 4. The same as Figure 2, but for W1 and W2 versus z, again
showing a projection of the space in which fits are performed. The WISE
fluxes retain their native Vega zero-point, hence the large relative WISE-SDSS
fluxes. The vertical lines indicate the redshifts at which various strong
quasar emission lines fall within any of the relevant filters (i, W1, or W2):
solid is Lyα, dotted is C IV, dashed is C III], dash-dotted is Mg II, and
dash-dot-dot-dotted is Hα. None of these lines has a strong effect on the behavior
of the normalized W1 or W2 fluxes as a function of redshift.


Figure 5. The same as Figure 4, but for W1 − W2 colors instead of fluxes.
The vertical lines indicate the range of redshifts for which Hα lies in the
W1 filter bandpass.


data not available in the training set of Bovy et al. (2012), we
explicitly compare the percentage of known quasars identified at
P_QSO > 0.8 using our new fits with SDSS+GALEX data and
SDSS+UKIDSS data and the same fits of Bovy et al. (2012) available
from the previous XDQSOz release. The new data do not
greatly alter the fits, but the changes are large enough that
the new fits improve the ability of XDQSOz to identify quasars by
~0.3% for SDSS+GALEX, and ~0.1% for SDSS+UKIDSS.

The bottom panels of Figure 8 show the comparison of the
peaks of the redshift PDFs versus the spectroscopic redshifts. All
objects are included in every panel, regardless of which
data are available for a given source (e.g. an object is included in
the panel using fits to all of the fluxes even if it doesn’t have all of
the fluxes available) or whether its redshift PDF has multiple peaks (see
below). There is a clear improvement over just SDSS data with the
addition of WISE, and again only a marginal further improvement
by adding GALEX and UKIDSS data. The groups of points that
are far off-axis and most strongly present in the SDSS-only panel
are largely due to strong quasar emission lines (e.g. Mg II or C IV)
moving between filters (see e.g. Figure 12 of Ball et al. 2008), and
additional data at other wavelengths help to reduce these effects.

To better illustrate the benefit of WISE, Figure 9 shows the
difference between the peak photometric redshifts and the spectroscopic
redshifts for several combinations of data. The use of
WISE data clearly has the single largest effect, with smaller additional
gains from including GALEX and UKIDSS. In the case of
SDSS+WISE, ~95% of the objects have Δz = |z_spec − z_phot| < 0.3
and ~76% have Δz < 0.1. The large bumps around |Δz| = 1.5
in most combinations of data without WISE represent the off-axis
clumps discussed above, and are largely removed with the additional
WISE data because only one emission line (Hα) affects the
WISE bandpasses, and only at higher redshifts (z > 3.5; see e.g.
Figure 4).

If we limit the analysis of the photometric redshifts to only
objects with photometry available in all filters, and only those that
have single-peaked² redshift PDFs, the performance is outstanding.
This is shown in Figure 10. In this case, ~98% of the 21,929 objects
have Δz < 0.3, and ~84% have Δz < 0.1.
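The two quantities used in this comparison, the number of peaks in a redshift PDF and the fraction of objects within a given Δz, can be sketched as follows. A peak is taken to be any contiguous region above the uniform distribution over 0.3 < z < 5.5, following the definition in the footnote; the function names and toy inputs are illustrative assumptions only.

```python
import numpy as np

def count_peaks(pdf, z_lo=0.3, z_hi=5.5):
    """Number of contiguous regions where a normalized PDF exceeds the uniform level."""
    uniform_level = 1.0 / (z_hi - z_lo)
    above = np.asarray(pdf, dtype=float) > uniform_level
    # A peak starts wherever `above` switches from False to True.
    previous = np.concatenate(([False], above[:-1]))
    return int(np.sum(above & ~previous))

def fraction_within(z_phot, z_spec, tol):
    """Fraction of objects with |z_spec - z_phot| < tol."""
    dz = np.abs(np.asarray(z_spec, dtype=float) - np.asarray(z_phot, dtype=float))
    return float(np.mean(dz < tol))

# Toy PDFs on a regular grid (hypothetical shapes).
z = np.linspace(0.3, 5.5, 521)
def gauss(mu, sigma):
    return np.exp(-0.5 * ((z - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

single = gauss(2.0, 0.1)
double = 0.5 * gauss(1.0, 0.1) + 0.5 * gauss(3.0, 0.1)
n_single = count_peaks(single)
n_double = count_peaks(double)
frac = fraction_within([1.0, 2.0, 3.0, 4.0], [1.05, 2.5, 3.2, 4.0], tol=0.3)
```

Selecting only objects for which the peak count equals one mimics the single-peaked restriction applied above.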

Figure 11 shows the redshift PDFs for four known quasars
with various combinations of data. These are the same objects that
are shown in Figure 12 of Bovy et al. (2012). In general we see that
the addition of WISE data has the power to significantly narrow the
PDF around the known spectroscopic redshifts. However, it is not
always the case that SDSS+WISE has a similar accuracy as when
all of the fluxes are used. The addition of GALEX and UKIDSS
data often has a significant effect on the photo-z estimates. The
appendix includes some examples of the effect of WISE data on the
redshift estimation of high-redshift (z > 3) quasars.


² A peak in the redshift distribution is defined as any continuous region
above the uniform distribution between 0.3 < z < 5.5.



Figure 9. The distribution of the difference between the peak of the photometric
z PDF and the spectroscopic z for the DR7 quasar sample, using several
different combinations of data. Using WISE fluxes clearly narrows the
distribution significantly, more so than using SDSS+GALEX+UKIDSS.
Gains from using all of the available data are small compared to using just
SDSS+WISE.
Figure 8. The three panels show the distributions of P_QSO over all redshifts for known spectroscopic quasars from SDSS DR7 (top panels), and a comparison
of the spectroscopic redshifts and the peaks of the photo-z PDFs (bottom panels). From left to right, the results use just SDSS, SDSS and WISE, and all
available photometry. Similar figures showing additional combinations of data are provided in the appendix.



Figure 10. A comparison of the photometric versus spectroscopic redshifts,
limited to only objects with all of the fluxes available and with single-peaked
redshift PDFs. The dashed line shows the one-to-one line, and the
dotted lines indicate Δz = 0.3.


3.2 Comparison with WISE color selection 

How does the use of WISE photometry in the XDQSO method compare
with pure WISE-color cuts to select AGN? Of course, with the
current XDQSOz formalism (and the use of force-photometered
WISE data at SDSS positions) we are limited to objects with optical
SDSS detections, and so we cannot directly compare the methods
for selecting obscured objects that fall below the limits of
SDSS due to obscuration of the quasar. However, it is fair to compare
the two methods in identifying quasars with optical detections,
as WISE colors are used to identify samples of unobscured
quasars for comparison with obscured quasars (e.g. Donoso et al.
2014; DiPompeo et al. 2014, 2015). To examine this question, we
start with a test sample of DR8 point sources and spectroscopically
confirmed quasars from the DR7 and DR10 quasar catalogues
(Schneider et al. 2010; Pâris et al. 2014) in a circular region centered
at RA=180°, Dec=40° with a radius of 10°. The use of both
quasar catalogues allows us to analyze the methods at high and
low redshifts, as DR7 was effectively limited to i < 20.2 for high-redshift
quasars, while the BOSS survey (included in DR10) specifically
targeted z > 2.2 quasars with g ≤ 22 or r < 21.85. After
applying the flag cuts used in the construction of the catalogue presented
here (section 3.3) and limiting to sources with W1 and W2
forced photometry available, there are 904,571 point sources and
8,280 spectroscopically confirmed quasars in this region.

We analyze both the completeness (N_QSO,sel / N_QSO) and the
selected target density (N_point,sel / area) for various cuts of P_QSO and
W1 − W2 (always applying a cut at W2 < 15 in the latter case;
Stern et al. 2012). Note again that all WISE magnitudes are in the
Vega system. This analysis is performed separately for two redshift
ranges, z < 1 and z > 2.5. The results are shown in Figure 12.
We can see that the XDQSO method is far more complete
than the simple WISE color cuts, especially for high-redshift objects
where the completeness is nearly an order of magnitude better.
This is not particularly surprising, as it is well noted in the work of
Stern et al. (2005), Stern et al. (2012), and Assef et al. (2013) that
applying a simple color cut finds virtually no quasars at z > 3. Figure 13
shows that for SDSS spectroscopic quasars, this is largely
due to the W2 limit: most SDSS quasars with W2 > 15 are at
z > 2. This is at least partially because the higher-z SDSS quasars
are, on average, optically (and therefore probably bolometrically)
fainter. However, Figure 13 also shows that most high-z objects have
W1 − W2 colors that are too blue to make the cut. At lower redshift,
using simple color cuts we find a completeness of ~15%, in
rough agreement with Stern et al. (2012). The most striking feature
in these plots is that the most conservative cuts in XDQSOz begin
where the WISE color cuts plateau.

Figure 11. Examples of redshift PDFs for known spectroscopic quasars using various combinations of photometric data.

For a given target density⁶, the XDQSO selection is more complete,
indicating that this method is also more efficient. However,
over all redshifts, to reach a completeness level above 90% requires
target densities over 100 per deg².

⁶ We note that the target density at a cut of W1 − W2 > 0.8 is lower
than what is found in Stern et al. (2012). This is because we require
optical detections in the SDSS, which means that we are missing the most
heavily obscured WISE-selected AGN. These objects may make up nearly
half of the quasar population (Hickox et al. 2007; DiPompeo et al. 2014;
Assef et al. 2014).
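As an illustration of the two statistics compared above, the following is a minimal sketch (not the code used for this analysis) that computes the completeness and the selected target density for a Pqso cut and for a WISE color cut of W1 − W2 > 0.8 with W2 < 15. The function name, the array names, the toy values, and the 2 deg² area are all assumptions made for the example.

```python
import numpy as np

def completeness_and_density(selected, is_known_qso, area_deg2):
    """Return (completeness, target density per deg^2) for a boolean selection.

    completeness = N_qso,sel / N_qso ; density = N_point,sel / area
    """
    selected = np.asarray(selected, dtype=bool)
    is_known_qso = np.asarray(is_known_qso, dtype=bool)
    n_qso = is_known_qso.sum()
    completeness = (selected & is_known_qso).sum() / n_qso if n_qso else np.nan
    density = selected.sum() / area_deg2
    return completeness, density

# Toy point-source table (hypothetical values, Vega magnitudes for WISE).
pqso = np.array([0.95, 0.85, 0.30, 0.10, 0.90, 0.05])
w1_minus_w2 = np.array([0.9, 0.5, 1.0, 0.2, 0.3, 0.1])
w2 = np.array([14.5, 16.0, 15.5, 14.0, 13.0, 15.0])
is_qso = np.array([True, True, True, False, False, False])
area = 2.0  # deg^2, assumed for the toy example

# Probability cut versus simple color cut (always combined with W2 < 15).
xd_comp, xd_dens = completeness_and_density(pqso > 0.8, is_qso, area)
cc_comp, cc_dens = completeness_and_density(
    (w1_minus_w2 > 0.8) & (w2 < 15.0), is_qso, area
)
print(xd_comp, xd_dens, cc_comp, cc_dens)
```

Scanning the threshold on Pqso (or on W1 − W2) and plotting completeness against density would trace out curves of the kind compared in this section.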


3.3 Quasar catalogue and code 

We release an updated version of the quasar catalogue of
Bovy et al. (2011), with the additional use of WISE data in
our fits to the flux-redshift density space. The catalogue includes
quasar/star probabilities for all SDSS DR8 point sources
(objc_type = 6) with a reasonable detection (extinction-corrected
magnitude in at least one band is above the completeness
limit; Aihara et al. 2011), a dereddened i-band magnitude
in the range 17.75 ≤ i < 22.45, and with Pqso > 0.2.
A catalogue of quasar/star probabilities for all of the SDSS
DR8 point sources is available upon request. The catalogue
includes the entries listed in Table 2 of Bovy et al. (2011),
along with 6 additions: galex_matched, galex_used,
ukidss_matched, ukidss_used, wise_matched, and
wise_used. The “matched” tags indicate whether an object was
detected in forced-photometry of each survey (true or false), and
the “used” tags indicate if the given survey’s fluxes were used in
the underlying model for the probability calculation.

Figure 12. A comparison of the completeness and efficiency in identifying quasars as a function of XDQSOz quasar probabilities and simple WISE color
cuts (combined with a W2 cut), using a test ~300 deg² region, and the DR7 and DR10 spectroscopic quasar catalogues. The left and right panels show low
(z < 1) and high (z > 2.5) redshift ranges, respectively. Colors indicate various cuts in Pqso and W1 − W2, as shown in the color bars. Cuts at Pqso = 0.8,
the inflection point in the total (all redshifts) quasar completeness for XDQSOz, are marked with a square, and the standard W1 − W2 cuts are indicated by
a triangle. XDQSOz is more complete and more efficient, especially at high redshift. This is largely because of the W2 restriction (Figure 13). Impressively,
XDQSOz picks up where WISE color cuts plateau.

Figure 13. The distributions of redshift for SDSS quasars (from both the
DR7 and DR10 quasar catalogs) with W2 < 15 (solid) and W2 > 15
(dashed). The W2 < 15 requirement for WISE-only quasar selection,
which prevents contamination from massive, star-forming galaxies at high
redshift, removes many high-redshift SDSS quasars as well. This at least
partially reflects the fact that the SDSS quasar selection uses fainter limits
at high redshift.

This catalogue contains 5,537,436 (total; 3,874,639 weighted
by probability) potential quasars, and the following additional tags
of redshift information: npeaks (number of peaks in the redshift
distribution, from 0 to 6), peakz (the redshift at the highest
probability of the widest continuous region above the uniform
distribution), peakprob (the probability associated with the peak
redshift), peakfwhm (the FWHM of the primary peak), otherz (the
redshifts of the peaks of up to six other regions above the uniform
distribution), otherprob (the peak probabilities of up to six
secondary peaks), and otherfwhm (the FWHM of up to six other
secondary peaks).
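The redshift tags and the two catalogue counts quoted above can be illustrated with a short sketch. It assumes that the probability-weighted count is the sum of Pqso over all entries, and it uses a toy structured array in place of the real FITS table; the column name pqso, the toy values, and the FWHM threshold of 0.3 are assumptions for the example, while npeaks, peakz, peakprob, and peakfwhm are the tags described in the text.

```python
import numpy as np

# Toy stand-in for a few catalogue rows (values are made up).
cat = np.array(
    [
        (0.95, 1, 2.31, 0.80, 0.10),
        (0.40, 3, 1.02, 0.35, 0.45),
        (0.25, 2, 0.95, 0.30, 0.50),
        (0.70, 2, 4.95, 0.60, 0.20),
    ],
    dtype=[
        ("pqso", "f8"),      # quasar probability (assumed column name)
        ("npeaks", "i4"),    # number of peaks in the redshift PDF
        ("peakz", "f8"),     # redshift of the primary peak
        ("peakprob", "f8"),  # probability associated with the primary peak
        ("peakfwhm", "f8"),  # FWHM of the primary peak
    ],
)

# Total number of candidates and the probability-weighted count.
n_total = len(cat)
n_weighted = cat["pqso"].sum()

# Objects with a single, narrow redshift peak (threshold is illustrative).
well_constrained = cat[(cat["npeaks"] == 1) & (cat["peakfwhm"] < 0.3)]
print(n_total, n_weighted, well_constrained["peakz"])
```

A selection like the last one is one possible way to isolate sources whose photometric redshifts are unambiguous; the appropriate thresholds depend on the application.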


The distribution of the peak photometric redshifts for this
catalogue is shown in Figure 14. There is a peak around z ~ 1, where
star/galaxy separation becomes difficult for faint objects (see
below). Many of these sources are likely galaxies with quasar-like
colors, and are potentially borderline objects. Somewhat surprising
is the additional peak at z ~ 5, given that high redshift quasars are
relatively rare. In visual inspection of the SDSS imaging around
100 random sources with photometric redshifts of ~5, we find that
approximately 20% have nearby bright red stars that likely
contaminate the photometry. Indeed, the majority of these objects have
the SDSS SUBTRACTED flag⁷ set. We extend the BOSS bright
star mask⁸ (as in, e.g., White et al. 2011, 2013) across the whole
SDSS region and apply it, which removes a significant number of
these objects. An additional tag has been added to the catalogue,
bright_star, to indicate objects that may suffer from
contamination from nearby bright stars because they lie within the bright
star mask.

There are other problematic regions in the SDSS that may
cause inaccurate quasar probabilities or redshifts, including fields
with poor photometry and regions in the North Galactic Cap with
bad u-band data. These are indicated in the catalogue in the tags
bad_field and bad_u, respectively. DiPompeo et al. (2014)
built a mask that removes regions around flagged/contaminated
data in the WISE catalogue, including bright WISE stars (which
may or may not be bright in the SDSS). An additional tag,
wise_flagged, is included in the catalogue to indicate objects
that lie in these regions.

⁷ This flag indicates that the wings of the PSF of a bright star have been
subtracted, as described in Stoughton et al. (2002) and at
https://www.sdss3.org/dr8/algorithms/flags_detail.php
⁸ The original mask covering the BOSS footprint can be
found at http://data.sdss3.org/datamodel/files/BOSS_LSS_REDUX/reject_mask/MASK.html.
Our extended version is provided as a MANGLE polygon file along with several
useful WISE masks as presented in DiPompeo et al. (2014), at
http://faraday.uwyo.edu/~admyers/wisemask2014/wisemask.html.
Note that there is a typo (missing “~”) in the URL of DiPompeo et al.
(2014).

However, even excluding objects that fall in these various
masks, there is still a spike in the photometric redshift distribution
of the catalogue around z ~ 5. Another possibility we tested is
that red galaxies that fail star-galaxy separation in the SDSS DR8
imaging (i.e. they are unresolved) could mimic high-z quasars and
cause this abundance of objects. Because the “stellar” training set is
from co-added Stripe 82 data, the average seeing is better than that
of the full catalogue and these contaminants would likely not be
included in the training. However, if this were the case, we should
see the spike at z ~ 5 reduced for the objects in the catalogue with
the best seeing, and this is not the case. The distribution of peak-z
values is not a function of the seeing. It is not immediately clear
in the SDSS imaging that there are obvious problems with the vast
majority of these z ~ 5 sources, but we recommend users exercise
caution with these objects until proper follow-up is performed.

One way to limit some of the potential contamination from,
e.g., failing star-galaxy separation is to limit the sample to relatively
bright (g < 21.5) objects. Indeed, as shown with the dotted
histogram in Figure 14, doing so removes the spikes in redshift and
thus likely provides a purer quasar sample.
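The cleaning steps described in this section (the bright_star, bad_field, bad_u, and wise_flagged tags, plus the g < 21.5 limit) can be combined into a single selection. The sketch below is illustrative only: it uses a toy structured array, and the g-band column name g_mag and the helper clean_selection are hypothetical names rather than part of the released catalogue or code.

```python
import numpy as np

def clean_selection(cat, g_max=21.5):
    """Boolean mask of rows outside all masks and brighter than g_max."""
    flagged = (
        cat["bright_star"] | cat["bad_field"] | cat["bad_u"] | cat["wise_flagged"]
    )
    return ~flagged & (cat["g_mag"] < g_max)

# Toy rows: (bright_star, bad_field, bad_u, wise_flagged, g_mag)
cat = np.array(
    [
        (False, False, False, False, 20.1),
        (True, False, False, False, 19.5),
        (False, False, False, False, 22.0),
        (False, False, False, True, 21.0),
        (False, False, False, False, 21.4),
    ],
    dtype=[
        ("bright_star", "?"),
        ("bad_field", "?"),
        ("bad_u", "?"),
        ("wise_flagged", "?"),
        ("g_mag", "f8"),  # hypothetical name for the g-band magnitude
    ],
)

keep = clean_selection(cat)
print(keep)
```

Whether to apply all of these cuts, or only some, depends on the purity and completeness required by a given study.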

We stress that this catalogue is probabilistic in nature, so
many of the objects are likely not quasars, and it is not intended
to represent, on its own, a statistical sample. This is largely due
to the heterogeneous nature of the SDSS data it is built from,
but it is possible to compile a subset of the catalogue that
represents a complete statistical quasar sample. This has been done
with previous versions of the XDQSOz catalog for an array of
studies, including: probing the intergalactic medium with close
quasar pairs (e.g. Prochaska et al. 2013; Hennawi & Prochaska
2013; Prochaska, Lau & Hennawi 2014; Rubin et al. 2014),
photometric clustering to probe primordial non-Gaussianity (Ho et al.
2013; Agarwal, Ho & Shandera 2014; Leistedt & Peiris 2014;
Leistedt, Peiris & Roth), probing the extent of the Galactic
halo (e.g. Deason et al.), and studying baryon acoustic
oscillations at high redshift (e.g. Slosar et al. 2013). The improved
catalog here should benefit future work in these areas, and is
also useful for cross-matching with other catalogues/wavelengths
in order to estimate quasar likelihoods or redshifts, analyses that do
not require statistically complete samples, and searches for unusual
objects.

The probabilistic quasar catalogue is available as a FITS file
(easily downloaded with a WGET command) at
http://www.mpia.de/homes/joe/xdqsozcat_galex_ukidss_wise_p20.fits.gz.
The updated XDQSO and XDQSOz codes
for target classification and photometric redshift estimation, as
well as the new flux-redshift density models including WISE
data, are publicly available as a GitHub repository at
https://github.com/xdqso/xdqso/. The FITS files containing the
model are identical to those in the previous release of XDQSOz,
with additional dimensions added to the end of each Gaussian
component containing W1 and W2 information (in that order).
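Because the W1 and W2 dimensions are appended to the end of each Gaussian component, a mixture without WISE information can in principle be recovered by marginalizing over the trailing dimensions, which for a Gaussian amounts to dropping the corresponding entries of the mean and the corresponding rows and columns of the covariance. The sketch below illustrates only this general property; the array layout (means of shape K × D, covariances of shape K × D × D) and the function name are assumptions, and the actual layout of the released FITS model files may differ.

```python
import numpy as np

def marginalize_trailing(mean, covar, n_drop=2):
    """Marginalize each Gaussian component over its last n_drop dimensions.

    mean  : (K, D) array of component means (assumed layout)
    covar : (K, D, D) array of component covariances (assumed layout)
    """
    d_keep = mean.shape[1] - n_drop
    return mean[:, :d_keep], covar[:, :d_keep, :d_keep]

# Toy mixture: K = 3 components in D = 6 dimensions, with the last two
# dimensions standing in for W1 and W2 (in that order).
K, D = 3, 6
mean = np.arange(K * D, dtype=float).reshape(K, D)
covar = np.stack([np.eye(D) * (k + 1) for k in range(K)])

mean_no_wise, covar_no_wise = marginalize_trailing(mean, covar)
print(mean_no_wise.shape, covar_no_wise.shape)
```

The component amplitudes are unchanged by this operation, so they are not touched here.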


4 CONCLUSION 

We have presented an update to the XDQSOz method of Bovy et al.
(2011, 2012) for quasar classification and photometric redshift
estimation that incorporates the two most sensitive WISE bands (3.4
and 4.6 μm, or W1 and W2, respectively) into the relative-flux-redshift
density model. The use of WISE information greatly
enhances the precision of the method in identifying quasars (by
~5-20%, depending on the redshift range of interest), especially at
z > 2. It also improves the overall accuracy of the photometric
redshift estimation. This method has better completeness compared to
simple WISE color cuts (for quasars with lower levels of
obscuration such that they are still detected and unresolved by the SDSS),
again especially at high redshift, at similar efficiency. We present a
catalogue of potential quasar candidates (Pqso > 0.2) with
photometric redshift estimates. This catalogue can be a powerful tool for
identifying quasar samples or estimating redshifts for a wide
variety of studies, and the improved accuracy will benefit future large
quasar surveys.

Figure 14. The peak photometric redshift distribution for the new quasar
catalogue (solid line). The peak around z = 5 can be reduced by applying
additional cuts to the data to remove artifacts or regions of bad photometry
(see section 3.3), but many of the sources appear real in optical images.
Additional follow-up is needed to determine if they are truly high-z quasars.
Both this peak and the one at z ~ 1 are removed by limiting to relatively
optically bright objects (g < 21.5; dotted line), which likely removes faint,
possibly borderline quasars that fail star-galaxy separation.
APPENDIX A: ADDITIONAL COMBINATIONS OF DATA

Given that this work uses updated catalogues of GALEX and
UKIDSS forced-photometry, in addition to the new WISE data, we
have performed fits to the relative-flux-redshift space using all
possible combinations of data so that the appropriate model can be
used for any combination of available data. These are included in
the new release of the XDQSOz code. Figures A1, A2, and A3 show
the distributions of probabilities returned for the known training
sets and the photometric versus spectroscopic redshifts using these
various combinations of data, for comparison with Figures 5, 7,
and 8.
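The idea of one model per combination of available data can be sketched as a simple lookup keyed on which surveys contribute fluxes. The example below only enumerates the combinations and builds labels in the style of the figure panel titles; the function name and the use of boolean inputs (which could come from the galex_used, ukidss_used, and wise_used tags) are assumptions, and how the released code actually maps a combination to a model file is not described here.

```python
from itertools import product

def model_label(galex, ukidss, wise):
    """Label for the data combination; SDSS photometry is always included."""
    parts = ["SDSS"]
    if galex:
        parts.append("GALEX")
    if ukidss:
        parts.append("UKIDSS")
    if wise:
        parts.append("WISE")
    return " + ".join(parts)

# All 2^3 combinations of the three optional surveys.
labels = [model_label(*flags) for flags in product([False, True], repeat=3)]
for label in labels:
    print(label)
```

For an object with GALEX and WISE fluxes but no UKIDSS coverage, model_label(True, False, True) gives the label "SDSS + GALEX + WISE".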


APPENDIX B: HIGH-REDSHIFT QUASARS 

One of the motivations for incorporating WISE data into the
XDQSO training is the utility of the mid-IR in finding
highly reddened and/or high-redshift quasars that are missed using
pure optical selection. Here we illustrate the improvement in
constraining the redshifts of high-z quasars when WISE fluxes are
incorporated. Because these are examples using known spectroscopic
quasars, they are fairly bright in the optical; the use of WISE data
will likely show an even more dramatic improvement in identifying
these objects when the optical data are fainter.
Figure A1. The same as Figure 5, illustrating the performance of XDQSOz on known quasars in broad redshift bins but for other combinations of photometric
data.

© 2014 RAS, MNRAS QQQ.mim

[Figure A2: recovered panel title SDSS + GALEX; vertical axis: Fraction.]
Figure A2. The same as Figure 7, illustrating the performance of XDQSOz on known stars but for other combinations of photometric data.


[Figure A3: recovered panel titles SDSS + GALEX; SDSS + UKIDSS; SDSS + GALEX + UKIDSS; SDSS + GALEX + WISE; SDSS + UKIDSS + WISE. Axes: Fraction versus P(qso).]
Figure A3. The same as Figure 8, illustrating the performance of XDQSOz on known quasars at any redshift but for other combinations of photometric data.
Figure B1. Examples of the effect of WISE information on the photometric redshift estimation for quasars at high redshift (z = 3.5, top left; z = 4, top right;
z = 5, bottom).